Spectrogram based multi-task audio classification
Similar resources
Conic Multi-task Classification
Traditionally, Multi-task Learning (MTL) models optimize the average of task-related objective functions, which is an intuitive approach and which we will be referring to as Average MTL. However, a more general framework, referred to as Conic MTL, can be formulated by considering conic combinations of the objective functions instead; in this framework, Average MTL arises as a special case, when...
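As a rough illustration of the distinction drawn in this abstract, the two objectives can be written side by side. The symbols below (T tasks, per-task objectives f_t, shared parameters w, weights \lambda_t) are assumptions introduced here for illustration and are not taken from the paper.

```latex
% Average MTL: minimize the plain average of the task objectives
\min_{w} \; \frac{1}{T} \sum_{t=1}^{T} f_t(w)

% Conic MTL: minimize a conic (nonnegative-weighted) combination
\min_{w} \; \sum_{t=1}^{T} \lambda_t \, f_t(w), \qquad \lambda_t \ge 0

% Equal weights \lambda_t = 1/T reduce the second objective to the first
```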
Universität Augsburg Audio Brush: Editing Audio in the Spectrogram
A tool for editing audio signals in the spectrogram is presented. It allows manipulating the spectrogram of a signal directly at any chosen time-frequency resolution and reconstructing the edited signal in HiFi quality, a capability that is usually not possible with the Fourier or wavelet transformation. Image processing and computer vision methods are applied to the spectrogram in order to id...
Universität Augsburg Audio Brush: Smart Audio Editing in the Spectrogram
Starting with a novel audio analysis and editing paradigm, a set of new and adaptive audio analysis and editing algorithms in the spectrogram are developed and integrated into a smart visual audio editing tool in a “what you see is what you hear” style. At the core of our algorithms and methods is a very flexible audio spectrogram that goes beyond FFT and Wavelets and supports manipulating a si...
Classification of Fricatives Using Novel Modulation Spectrogram Based Features
In this paper, we propose the use of a novel feature set, the modulation spectrogram, for fricative classification. The modulation spectrogram gives a 2-dimensional (2-D) feature vector for each phoneme. Higher Order Singular Value Decomposition (HOSVD) is used to reduce the size of the high-dimensional feature vector obtained from the modulation spectrogram. These features are then used to classify th...
Multi-timescale PMSCs for Music Audio Classification
Principal mel-spectrum components (PMSCs) [3] are computed at several timescales in parallel [2]. For each timescale, the feature extraction involves four steps: discrete Fourier transform (DFT), mel-scaling, principal component analysis (PCA) whitening and temporal pooling. First, for each timescale, we compute discrete Fourier transforms over a given time length. To compute PMSCs at different ...
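The four steps named in this abstract (DFT, mel-scaling, PCA whitening, temporal pooling) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the Hann window, the 50% hop, the log compression, mean pooling, and the choice of frame length as the "timescale" are all assumptions made here.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters with centers equally spaced on the mel scale.
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_mel_frames(signal, sr, frame_len, n_mels=20, eps=1e-8):
    # Steps 1-2: windowed DFT magnitude per frame, then mel-scaling.
    hop = frame_len // 2  # assumed 50% overlap
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)
    spec = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(spec @ mel_filterbank(n_mels, frame_len, sr).T + eps)

def fit_pca_whitener(x, n_components, eps=1e-8):
    # Step 3 (fit): keep the leading eigenvectors, scaled to unit variance.
    mean = x.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(x - mean, rowvar=False))
    order = np.argsort(eigval)[::-1][:n_components]
    return mean, eigvec[:, order] / np.sqrt(eigval[order] + eps)

def pmsc(signal, sr, frame_len, whitener):
    # Step 3 (apply) and step 4: whiten each frame, then mean-pool over time.
    mean, proj = whitener
    return ((log_mel_frames(signal, sr, frame_len) - mean) @ proj).mean(axis=0)

# Hypothetical data: noise stands in for a training set, a noisy tone for a clip.
sr = 8000
rng = np.random.default_rng(0)
train = rng.standard_normal(sr)
clip = np.sin(2 * np.pi * 440.0 * np.arange(sr) / sr) + 0.1 * rng.standard_normal(sr)

timescales = (256, 1024)  # assumed frame lengths, one per timescale
features = np.concatenate([
    pmsc(clip, sr, n, fit_pca_whitener(log_mel_frames(train, sr, n), n_components=8))
    for n in timescales
])
```

The whitener is fitted on separate data and then applied to the clip, since mean-pooling frames that were centered on their own mean would give a vector of zeros. Features from each timescale are simply concatenated, giving 8 components per timescale in this sketch.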
Journal
Journal title: Multimedia Tools and Applications
Year: 2017
ISSN: 1380-7501, 1573-7721
DOI: 10.1007/s11042-017-5539-3